Computer and Modernization ›› 2013, Vol. 1 ›› Issue (5): 7-9. doi: 10.3969/j.issn.1006-2475.2013.05.002

• Algorithm Analysis and Design •

Eigenvector Selection Algorithm for Spectral Clustering Based on Mean

WANG Sen-hong¹, DAI Qing-yun², CAO Jiang-zhong¹, ZHU Jing¹

  1. School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China
  2. Science and Technology Department, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2012-12-31 Online: 2013-05-28 Published: 2013-05-28

Abstract: Spectral clustering is one of the most popular methods for data clustering, and its performance depends on which eigenvectors of the associated graph Laplacian matrix are selected. For a K-class problem, the Ng-Jordan-Weiss (NJW) spectral clustering algorithm typically uses the eigenvectors corresponding to the K largest eigenvalues of the Laplacian matrix as a new representation of the data. For some problems, however, these K eigenvectors do not necessarily capture the information in the original data well. This paper proposes a mean-based eigenvector selection algorithm for spectral clustering. The algorithm first computes the mean of the 3K largest eigenvalues of the graph Laplacian matrix, and then selects the eigenvectors corresponding to the K eigenvalues nearest to that mean. Compared with classical spectral clustering algorithms, the proposed algorithm achieves better clustering performance on UCI datasets.

Key words: spectral clustering, Laplacian matrix, eigenvalue, mean, eigenvector selection
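
The selection rule described in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation. It assumes the NJW form of the matrix, D^{-1/2} A D^{-1/2} with a Gaussian affinity A, and it assumes the K selected eigenvalues are drawn from among the 3K largest, which the abstract does not state explicitly. The function names and the `sigma` parameter are hypothetical.

```python
import numpy as np


def normalized_affinity(X, sigma=1.0):
    """Build D^{-1/2} A D^{-1/2} from a Gaussian affinity A with zero diagonal.

    Assumption: this is the NJW-style matrix the abstract calls the Laplacian.
    """
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def select_eigenvectors_by_mean(L, k):
    """Pick the k eigenpairs whose eigenvalues are nearest the mean of the 3k largest."""
    eigvals, eigvecs = np.linalg.eigh(L)      # symmetric input, ascending eigenvalues
    m = min(3 * k, len(eigvals))
    top_vals = eigvals[-m:][::-1]             # the 3k largest, in descending order
    top_vecs = eigvecs[:, -m:][:, ::-1]
    mean_val = top_vals.mean()
    # Assumption: candidates are restricted to these 3k eigenvalues.
    chosen = np.argsort(np.abs(top_vals - mean_val), kind="stable")[:k]
    return top_vals[chosen], top_vecs[:, chosen]


def spectral_embedding(X, k, sigma=1.0):
    """Row-normalized embedding built from the selected eigenvectors (as in NJW)."""
    L = normalized_affinity(X, sigma)
    _, U = select_eigenvectors_by_mean(L, k)
    norms = np.linalg.norm(U, axis=1, keepdims=True)
    return U / np.where(norms == 0, 1.0, norms)
```

The rows of the returned embedding would then be clustered with k-means, as in the standard NJW pipeline. The only change from NJW in this sketch is which K eigenvectors are kept.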
